Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: implement AptManager for slurm_ops #24

Merged
merged 4 commits into from
Sep 16, 2024

Conversation

jedel1043
Copy link
Contributor

Implements an AptManager for slurm_ops that uses apt to install packages and systemctl to manage services.

@jedel1043 jedel1043 force-pushed the deb-manager branch 3 times, most recently from 7e49208 to b577b79 Compare September 12, 2024 21:11
@jedel1043 jedel1043 marked this pull request as ready for review September 12, 2024 21:15
@NucciTheBoss NucciTheBoss self-requested a review September 13, 2024 13:37
@NucciTheBoss NucciTheBoss added the enhancement New feature or request label Sep 13, 2024
Copy link
Member

@NucciTheBoss NucciTheBoss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great so far! 🤩

Just a couple comments around the implementation of _MungectlManager and _AptManager. Big thing I want to make sure that we're not doing is stepping into a minefield with overriding the systemd service file for slurmd both here and in the slurmd charm. Let me know if you have any questions.

Comment on lines 396 to 400
class _MungectlManager(MungeKeyManager):
"""Control the munge key using mungectl."""

def __init__(self, binpath: str) -> None:
self._binpath = binpath
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure we need to have a separate mungectl manager here. If I understand the motivation here correctly, we need this manager because there's stratification between the deb version of Slurm and the snap version of Slurm being that the binary name for mungectl with the deb is just mungectl while the binary name in the snap is slurm.mungectl.

What I'm wondering here is if we just create an alias for mungectl upon installing the Slurm snap:

    def install(self) -> None:
        """Install Slurm using the `slurm` snap."""
        # FIXME: Pin slurm to the stable channel
        _snap("install", "slurm", "--channel", "latest/candidate", "--classic")
        # FIXME: Request automatic alias for `mungectl` so that we don't need to do this manually
        _snap("alias", "slurm.mungectl", "mungectl")

Then we can just declare a partial function for mungectl:

We could possibly do the same thing for the _snap function also since there's no more special sauce/manipulations to the output of _call being done.

_mungectl = functools.partial(_call, "mungectl")

Let me know your thoughts about doing this instead of having a dedicated MungectlManager instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, sounds like a cleaner solution

tests/integration/test_hpc_libs.yaml Outdated Show resolved Hide resolved
Comment on lines 516 to 580
UBUNTU_HPC_SLURM_WLM_KEY = """
-----BEGIN PGP PUBLIC KEY BLOCK-----
Comment: Hostname:
Version: Hockeypuck 2.2

xsFNBGTuZb8BEACtJ1CnZe6/hv84DceHv+a54y3Pqq0gqED0xhTKnbj/E2ByJpmT
NlDNkpeITwPAAN1e3824Me76Qn31RkogTMoPJ2o2XfG253RXd67MPxYhfKTJcnM3
CEkmeI4u2Lynh3O6RQ08nAFS2AGTeFVFH2GPNWrfOsGZW03Jas85TZ0k7LXVHiBs
W6qonbsFJhshvwC3SryG4XYT+z/+35x5fus4rPtMrrEOD65hij7EtQNaE8owuAju
Kcd0m2b+crMXNcllWFWmYMV0VjksQvYD7jwGrWeKs+EeHgU8ZuqaIP4pYHvoQjag
umqnH9Qsaq5NAXiuAIAGDIIV4RdAfQIR4opGaVgIFJdvoSwYe3oh2JlrLPBlyxyY
dayDifd3X8jxq6/oAuyH1h5K/QLs46jLSR8fUbG98SCHlRmvozTuWGk+e07ALtGe
sGv78ToHKwoM2buXaTTHMwYwu7Rx8LZ4bZPHdersN1VW/m9yn1n5hMzwbFKy2s6/
D4Q2ZBsqlN+5aW2q0IUmO+m0GhcdaDv8U7RVto1cWWPr50HhiCi7Yvei1qZiD9jq
57oYZVqTUNCTPxi6NeTOdEc+YqNynWNArx4PHh38LT0bqKtlZCGHNfoAJLPVYhbB
b2AHj9edYtHU9AAFSIy+HstET6P0UDxy02IeyE2yxoUBqdlXyv6FL44E+wARAQAB
zRxMYXVuY2hwYWQgUFBBIGZvciBVYnVudHUgSFBDwsGOBBMBCgA4FiEErocSHcPk
oLD4H/Aj9tDF1ca+s3sFAmTuZb8CGwMFCwkIBwIGFQoJCAsCBBYCAwECHgECF4AA
CgkQ9tDF1ca+s3sz3w//RNawsgydrutcbKf0yphDhzWS53wgfrs2KF1KgB0u/H+u
6Kn2C6jrVM0vuY4NKpbEPCduOj21pTCepL6PoCLv++tICOLVok5wY7Zn3WQFq0js
Iy1wO5t3kA1cTD/05v/qQVBGZ2j4DsJo33iMcQS5AjHvSr0nu7XSvDDEE3cQE55D
87vL7lgGjuTOikPh5FpCoS1gpemBfwm2Lbm4P8vGOA4/witRjGgfC1fv1idUnZLM
TbGrDlhVie8pX2kgB6yTYbJ3P3kpC1ZPpXSRWO/cQ8xoYpLBTXOOtqwZZUnxyzHh
gM+hv42vPTOnCo+apD97/VArsp59pDqEVoAtMTk72fdBqR+BB77g2hBkKESgQIEq
EiE1/TOISioMkE0AuUdaJ2ebyQXugSHHuBaqbEC47v8t5DVN5Qr9OriuzCuSDNFn
6SBHpahN9ZNi9w0A/Yh1+lFfpkVw2t04Q2LNuupqOpW+h3/62AeUqjUIAIrmfeML
IDRE2VdquYdIXKuhNvfpJYGdyvx/wAbiAeBWg0uPSepwTfTG59VPQmj0FtalkMnN
ya2212K5q68O5eXOfCnGeMvqIXxqzpdukxSZnLkgk40uFJnJVESd/CxHquqHPUDE
fy6i2AnB3kUI27D4HY2YSlXLSRbjiSxTfVwNCzDsIh7Czefsm6ITK2+cVWs0hNQ=
=cs1s
-----END PGP PUBLIC KEY BLOCK-----
"""

UBUNTU_HPC_EXPERIMENTAL_KEY = """
-----BEGIN PGP PUBLIC KEY BLOCK-----
Comment: Hostname:
Version: Hockeypuck 2.2

xsFNBGTuZb8BEACtJ1CnZe6/hv84DceHv+a54y3Pqq0gqED0xhTKnbj/E2ByJpmT
NlDNkpeITwPAAN1e3824Me76Qn31RkogTMoPJ2o2XfG253RXd67MPxYhfKTJcnM3
CEkmeI4u2Lynh3O6RQ08nAFS2AGTeFVFH2GPNWrfOsGZW03Jas85TZ0k7LXVHiBs
W6qonbsFJhshvwC3SryG4XYT+z/+35x5fus4rPtMrrEOD65hij7EtQNaE8owuAju
Kcd0m2b+crMXNcllWFWmYMV0VjksQvYD7jwGrWeKs+EeHgU8ZuqaIP4pYHvoQjag
umqnH9Qsaq5NAXiuAIAGDIIV4RdAfQIR4opGaVgIFJdvoSwYe3oh2JlrLPBlyxyY
dayDifd3X8jxq6/oAuyH1h5K/QLs46jLSR8fUbG98SCHlRmvozTuWGk+e07ALtGe
sGv78ToHKwoM2buXaTTHMwYwu7Rx8LZ4bZPHdersN1VW/m9yn1n5hMzwbFKy2s6/
D4Q2ZBsqlN+5aW2q0IUmO+m0GhcdaDv8U7RVto1cWWPr50HhiCi7Yvei1qZiD9jq
57oYZVqTUNCTPxi6NeTOdEc+YqNynWNArx4PHh38LT0bqKtlZCGHNfoAJLPVYhbB
b2AHj9edYtHU9AAFSIy+HstET6P0UDxy02IeyE2yxoUBqdlXyv6FL44E+wARAQAB
zRxMYXVuY2hwYWQgUFBBIGZvciBVYnVudHUgSFBDwsGOBBMBCgA4FiEErocSHcPk
oLD4H/Aj9tDF1ca+s3sFAmTuZb8CGwMFCwkIBwIGFQoJCAsCBBYCAwECHgECF4AA
CgkQ9tDF1ca+s3sz3w//RNawsgydrutcbKf0yphDhzWS53wgfrs2KF1KgB0u/H+u
6Kn2C6jrVM0vuY4NKpbEPCduOj21pTCepL6PoCLv++tICOLVok5wY7Zn3WQFq0js
Iy1wO5t3kA1cTD/05v/qQVBGZ2j4DsJo33iMcQS5AjHvSr0nu7XSvDDEE3cQE55D
87vL7lgGjuTOikPh5FpCoS1gpemBfwm2Lbm4P8vGOA4/witRjGgfC1fv1idUnZLM
TbGrDlhVie8pX2kgB6yTYbJ3P3kpC1ZPpXSRWO/cQ8xoYpLBTXOOtqwZZUnxyzHh
gM+hv42vPTOnCo+apD97/VArsp59pDqEVoAtMTk72fdBqR+BB77g2hBkKESgQIEq
EiE1/TOISioMkE0AuUdaJ2ebyQXugSHHuBaqbEC47v8t5DVN5Qr9OriuzCuSDNFn
6SBHpahN9ZNi9w0A/Yh1+lFfpkVw2t04Q2LNuupqOpW+h3/62AeUqjUIAIrmfeML
IDRE2VdquYdIXKuhNvfpJYGdyvx/wAbiAeBWg0uPSepwTfTG59VPQmj0FtalkMnN
ya2212K5q68O5eXOfCnGeMvqIXxqzpdukxSZnLkgk40uFJnJVESd/CxHquqHPUDE
fy6i2AnB3kUI27D4HY2YSlXLSRbjiSxTfVwNCzDsIh7Czefsm6ITK2+cVWs0hNQ=
=cs1s
-----END PGP PUBLIC KEY BLOCK-----
"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nit]: Typically I like to have all constants declared at the beginning of the file after the import statements. It just flows better in my mind since I know where to refer to all constants within a particular module 😅

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll just move them inside the install function, since I think it's better to have them closer to its usage.

Comment on lines 479 to 493
def enable(self) -> None:
"""Enable service.

Raises:
SlurmOpsError: Raised if `systemctl enable ...` returns a non-zero returncode.
"""
_call("systemctl", "enable", "--now", self._service.value)

def disable(self) -> None:
"""Disable service."""
_call("systemctl", "disable", "--now", self._service.value)

def restart(self) -> None:
"""Restart service."""
_call("systemctl", "reload-or-restart", self._service.value)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nit]: Could we potentially use a partial function here?

_systemctl = functools.partial(_call, "systemctl")

Comment on lines 497 to 508
cmd = ["systemctl", "is-active", "--quiet", self._service.value]
_logger.debug(f"Executing command {cmd}")
proc = subprocess.run(
cmd,
stdout=subprocess.PIPE,
stderr=subprocess.STDOUT,
text=True,
bufsize=1,
encoding="utf-8",
)
_logger.debug(f"command {cmd} exit code: {proc.returncode}. output:\n{proc.stdout}")
return proc.returncode == 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nit]: I'm wondering if we should potentially just modify _call here to also return the exit code of a subprocess as well? Like Tuple[str, int]. That way we don't need a different way for calling is-active. Let me know your thoughts here.

I know that this method is based off of the implementation from the systemd charm library 😅

lib/charms/hpc_libs/v0/slurm_ops.py Outdated Show resolved Hide resolved
lib/charms/hpc_libs/v0/slurm_ops.py Outdated Show resolved Hide resolved
lib/charms/hpc_libs/v0/slurm_ops.py Outdated Show resolved Hide resolved
Modifies the integration tests to also test AptManager.
Copy link
Member

@NucciTheBoss NucciTheBoss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work on this!

I'm happy with how this library has come together. I made a couple "nit/foresight"-level comments, but these are things that can be addressed in the future as part of gradual improvements to the Slurm charms

There are some formatting issues with the PPA keys, so I'm just going to fix them quickly with suggestions, and then open a PR to bump the version of slurm_ops for Charmhub.

lib/charms/hpc_libs/v0/slurm_ops.py Outdated Show resolved Hide resolved
lib/charms/hpc_libs/v0/slurm_ops.py Outdated Show resolved Hide resolved
lib/charms/hpc_libs/v0/slurm_ops.py Outdated Show resolved Hide resolved
lib/charms/hpc_libs/v0/slurm_ops.py Outdated Show resolved Hide resolved
Comment on lines +207 to +217
def _mungectl(self, *args: str, stdin: Optional[str] = None) -> str:
"""Control munge via `mungectl ...`.

Args:
*args: Arguments to pass to `mungectl`.
stdin: Input to pass to `mungectl` via stdin.

Raises:
subprocess.CalledProcessError: Raised if `mungectl` command fails.
"""
return _call("mungectl", *args, stdin=stdin).stdout
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nit]: Does this method need to be part of this class since it's a static method?

Currently a nit because I don't think that this needs to be addressed right now, but something we should be aware of later when we spend time stabilizing the API for this library. For now I we just need something good that works 😄

)

@property
def slurm_path(self) -> Path:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[foresight]: I wonder if eventually we'll want to call this property etc_path rather than slurm_path so it's more specific which Slurm-related directory is being returned, but right now slurm_path works for me 👍

@NucciTheBoss NucciTheBoss merged commit 8233bb1 into charmed-hpc:main Sep 16, 2024
4 checks passed
@jedel1043 jedel1043 deleted the deb-manager branch September 16, 2024 18:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants